Towards Automatic Detecting of Overlapping Genes - Clustered BLAST Analysis of Viral Genomes

Authors

  • Klaus Neuhaus
  • Daniela Oelke
  • David Fürst
  • Siegfried Scherer
  • Daniel A. Keim
Abstract

Overlapping genes (encoded on the same DNA locus but in different frames) are thought to be rare and, therefore, were largely neglected in the past. In a test set of 800 viruses we found more than 350 potential overlapping open reading frames of >500 bp which generate BLAST hits, indicating a possible biological function. Interestingly, five overlaps with more than 2000 bp were found, the largest of which may even contain triple overlaps. In order to perform the vast number of BLAST searches required to test all detected open reading frames, we compared two clustering strategies (BLASTCLUST and k-means) and queried the database with one representative per cluster only. Our results show that this approach achieves a significant speed-up while retaining a high quality of the results (>99% precision compared to single queries) for both clustering methods. Future wet lab experiments are needed to show whether the detected overlapping reading frames are biologically functional.
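
As a rough illustration of what "encoded on the same DNA locus but in different frames" means, the following minimal Python sketch scans the three forward reading frames for open reading frames and reports pairs that overlap in different frames. It is not the authors' pipeline: the function names `find_orfs` and `overlapping_pairs`, the ATG-to-stop ORF definition, the forward-strand-only scan, and the toy sequence are all assumptions made for this example.

```python
from itertools import combinations

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_len=6):
    """Return (start, end, frame) tuples for ATG..stop ORFs.

    Assumption for this sketch: only the forward strand is scanned and an
    ORF runs from the first ATG in a frame to the next in-frame stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a new ORF at the first start codon
            elif start is not None and codon in STOP_CODONS:
                end = i + 3  # end is exclusive and includes the stop codon
                if end - start >= min_len:
                    orfs.append((start, end, frame))
                start = None
    return orfs


def overlapping_pairs(orfs, min_overlap=1):
    """Return (orf_a, orf_b, overlap_bp) for ORFs overlapping in different frames."""
    pairs = []
    for a, b in combinations(orfs, 2):
        if a[2] == b[2]:
            continue  # same frame: not an overlapping-gene candidate
        overlap = min(a[1], b[1]) - max(a[0], b[0])
        if overlap >= min_overlap:
            pairs.append((a, b, overlap))
    return pairs


if __name__ == "__main__":
    toy = "ATGCATGCCATAACCTGA"  # hypothetical toy sequence, not real viral data
    for a, b, bp in overlapping_pairs(find_orfs(toy)):
        print(a, b, bp)
```

For a threshold like the >500 bp mentioned in the abstract, `min_overlap` would be raised accordingly; a real analysis would also scan the reverse strand and pass the candidates on to clustering and BLAST searches, which this sketch does not attempt.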

Similar resources

Polymerization of non-complementary RNA: Systematic symmetric nucleotide exchanges mainly involving uracil produce mitochondrial RNA transcripts coding for cryptic overlapping genes

Usual DNA→RNA transcription exchanges T→U. Assuming different systematic symmetric nucleotide exchanges during translation, some GenBank RNAs match exactly human mitochondrial sequences (exchange rules listed in decreasing transcript frequencies): C↔U, A↔U, A↔U+C↔G (two nucleotide pairs exchanged), G↔U, A↔G, C↔G, none for A↔C, A↔G+C↔U, and A↔C+G↔U. Most unusual transcripts involve exchanging ur...

Using hidden Markov models and observed evolution to annotate viral genomes

Motivation: ssRNA (single-stranded) viral genomes are generally constrained in length and utilize overlapping reading frames to maximally exploit the coding potential within the genome length restrictions. This overlapping coding phenomenon leads to complex evolutionary constraints operating on the genome. In regions which code for more than one protein, silent mutations in one reading frame gen...

Varicella Zoster Virus (VZV) Origin-Dependent Plasmid Replication in the Presence of the Four Overlapping Cosmids Comprising the Complete Genome of VZV

The Varicella-Zoster Virus (VZV) genome contains both cis-acting and trans-acting elements, which are important in viral DNA replication. The cis-acting elements consist of two copies of oriS, and the trans-acting elements are those genes whose products are required for virus DNA replication. It has been shown that each of the seven genes required for ori-dependent DNA synthesis of Herpes Simpl...

Broadening Gene Pool of Rice for Resistance to Biotic Stresses Through Wide Hybridization

Variability in the cultivated germplasm for economic traits such as resistance to rice tungro virus, sheath blight, yellow stem borer, drought and salt tolerance is limited. This necessitated a search for genes in the secondary and tertiary gene pools of the genus Oryza. Fortunately, wild species are an important reservoir of useful genes for resistance to major disease, pest and tolerance t...

Acquired Antimicrobial Resistance Genes of Escherichia coli Obtained from Nigeria: In silico Genome Analysis

Background: Antimicrobial resistance is a global problem with enormous public health and economic impact. This study was carried out to get an overview of acquired antimicrobial resistance gene sequences in the genomes of Escherichia coli isolated from different food sources and the environment in Nigeria. Methods: To determine the acquired antimicrobial-resistant genes prevalence, genome asse...

Publication date: 2010